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APPROXIMATIONS OF M/PH/N + M SYSTEMS* 

By Anton Braverman and J. G. Dai 
Cornell University 

We consider M/Ph/n + M queueing systems in steady state. We 
prove that the Wasserstein distance between the stationary distribu¬ 
tion of the normalized system size process and that of a piecewise 
Ornstein-Uhlenbeck (OU) process is bounded by C /\f\, where the 
constant C is independent of the arrival rate A and the number of 
servers n as long as they are in the Halfin-Whitt parameter regime. 

For each integer m > 0, we also establish a similar bound for the dif¬ 
ference of the mth steady-state moments. For the proofs, we develop 
a modular framework that is based on Stein’s method. The frame¬ 
work has three components: Poisson equation, generator coupling, 
and state space collapse. The framework, with further refinement, is 
likely applicable to steady-state diffusion approximations for other 
stochastic systems. 


1. Introduction. This paper focuses on M/Ph/n + M systems, which serve as build¬ 
ing blocks to model large-scale service systems such as customer contact centers [1, 23] 
and hospital operations [2, ]. In such a system, there are n identical servers, the arrival 

process is Poisson (the symbol M ) with rate A, the service times are i.i.d. having a phase- 
type distribution (the symbol Ph ) with mean l//r, the patience times of customers are 
i.i.d. having an exponential distribution (the symbol +M) with mean 1/a < oo. When the 
waiting time of a customer in queue exceeds her patience time, the customer abandons the 
system without service; once the service of a customer is started, the customer does not 
abandon. 

Let Xi(t) be the number of customers in phase i at time t for i = 1,..., d, where d is the 
number of phases in the service time distribution. Let X{t) be the corresponding vector. 
Then the system size process X = {X(t),t > 0} has a unique stationary distribution for 
any arrival rate A and any server number n due to customer abandonment; although X is 
not a Markov chain, it is a function of a Markov chain with a unique stationary distribution, 
see Section 4 for details. In this paper, we prove, in Theorem 1, that 


( 1 . 1 ) 


sup 

hen 


E[/i(A( a )(oo))] — E [h(Y (oo))] 


< 


C_ 

71 


for any A > 0 and n > 1 
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satisfying 

(1.2) n/i = A + /3VX, 

where /3 € M is some constant and T~L is some class of functions h : W l —>• M. In (1.1), 
_A^ a )(oo) is a random vector having the stationary distribution of a properly scaled version 
of X = X that depends on the arrival rate A, number of servers n, the service time 
distribution, and the abandonment rate a, and T(oo) is a random vector having the sta¬ 
tionary distribution of a piecewise Ornstein-Uhlenbeck (OU) process Y = {Y(t),t > 0}. 
The stationary distribution of X^ exists even when f3 is negative because a is assumed to 
be positive. The constant C depends on the service time distribution, abandonment rate 
a, the constant /? in (1.2), and the choice of "H, but C is independent of the arrival rate 
A and the number of servers n. Two different classes H will used in our Theorem 1. First, 
we take T~L to be the class of polynomials up to a certain order. In this case, (1.1) provides 
rates of convergence for steady-state moments. Second, 'H is taken to be , the class of 
all 1-Lipschitz functions 

(1.3) W (d) = {h : R d ->• R : | h{x) - h(y) | < \x - y\}. 

In this case, (1.1) provides rates of convergence for stationary distributions under the 
Wasserstein metric [47]; convergence under Wasserstein metric implies the convergence in 
distribution [24]. 

In [ 4 ], an algorithm was developed to compute the stationary distribution of the dif¬ 
fusion process Y. The distribution of Y(oo) is then used to approximate the stationary 
distribution of X^ x \ The approximation is remarkably accurate; see, for example, Figure 
1 there. It was demonstrated that computational efficiency, in terms of both time and 
memory, can be achieved by diffusion approximations. For example, in an M/H 2 / 5 OO + M 
system studied in [ 4], where the system has 500 servers and a hyper-exponential service 
time distribution, it took around 1 hour and peak memory usage of 5 GB to compute the 
stationary distribution of X^> using an algorithm that fully explores the special structure 
of a three-dimensional Markov chain. On the same computer, to compute the stationary 
distribution of the corresponding two-dimensional diffusion process it took less than 1 
minute and peak memory usage was less than 200 MB. The computational saving by the 
diffusion model is achieved partly through state space collapse (SSC), a phenomenon that 
causes dimension reduction in state space. Theorem 1 quantifies the steady-state diffusion 
approximations developed in [ ]. 

In [29], the authors prove a version of (1.1) for the M/M/n + M system, a special case 
of the M/Ph/n + M system where the service time distribution is exponential. They do 
not impose assumption (1.2) on the relationship between the arrival rate A and number of 
servers n, resulting in a universal approximation that is accurate in any parameter regime, 
from underloaded, to critically loaded, and to overloaded. To our knowledge, this is the first 
paper to study convergence rates of steady state diffusion approximations. Their method 
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relies on analyzing excursions of a one-dimensional Markov chain and the corresponding 
diffusion process. It is unclear how to generalize their method to the multi-dimensional 
setting. 

To prove Theorem 1, we develop a framework that is based on Stein’s method [49, 50]. 
The framework is modular and relies on three components: a Poisson equation, generator 
coupling, and SSC. The framework itself is an important part of our contribution, in ad¬ 
dition to Theorem 1. We expect the framework will be refined and used to prove rates of 
convergence of steady-state diffusion approximations for many other stochastic systems. 
This framework is closely related to a recent paper [ ] by Gurvich. We will discuss his 

work after giving an overview of the framework. 

We consider two sequences of stochastic processes and (5^)}°°^ indexed by 

£, where = {X^(t) € R d ,t > 0} is a continuous-time Markov chain (CTMC) and 
yR) _ g > 0} is a diffusion process. Suppose xW(oo) and Y^(oo) are two 

random vectors having the stationary distributions of jW and Y^\ respectively. Let G X (t) 
and G Y w be the generators of X^ and Y^\ respectively; for a diffusion process, G Y (i) is 
the second order elliptic operator as in (5.3). For a function h : R d —> R in a ’’nice” (but 
large enough) class, we wish to bound 

Eh(X^( oo )) -E h(Y {l \ °o))| . 

Component 1. The first step is to set up the Poisson equation 

(1.4) Gy W f h (x ) = h(x) - m(yW(po)) 

and obtain various estimates of a solution fh to the Poisson equation. Once we have fh , 
one can take the expectation of both sides above to see that 

(1.5) Eh{X^{oo)) - E/i(YW(oo)) = EG YW f h (xW(oo)). 

The Poisson equation (1.4) is a partial differential equation (PDE). Even when Y^\ oo) = 
Y (oo) (i.e. independent of £), one of the biggest challenges is obtaining bounds on the partial 
derivatives of fh(x ) (usually up to third order). We refer to these as gradient bounds. In 
the one-dimensional case, (1.4) is an ordinary differential equation (ODE) that usually has 
a closed form expression that one can analyze directly, see for instance [12, Lemma 13.1]. 
However, when d > 1 obtaining these gradient bounds becomes significantly harder. By 
exploiting probabilistic solutions to the Poisson equation, gradient bounds were established 
for cases when Y(oo) is a multivariate normal [ ], multivariate Poisson [ ] and multivariate 
Gamma [41]. 

Component 2. The next step is to produce the generator coupling. For that, we use 
the basic adjoint relationship (BAR) for the stationary distribution of X^(oo). One can 
check that a random vector X^\oo) € R rf has the stationary distribution of the CTMC 
X^ if and only if 

( 1 . 6 ) 


EG xW f(X^(oo))=0 
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for all functions / : —>• M that have compact support. For a given h, the corresponding 
Poisson equation solution //, does not have compact support. An important part of this 
step is to prove that (1.6) continues to hold for fh . Thus, it follows from (1.5) and (1.6) 
that 

(1.7) E/i(xW(oo)) - E/i(yW(oo)) = E[G YW f h {xW(oo)) - G xlt) f h (X®( oo))]. 

Note that two random variables in the left side of (1.7) are typically defined on two different 
probability spaces, whereas two random variables in the right side of (1.7) are all defined 
in terms of X^\oo), thus producing a coupling on a common probability space. 

To bound the right side of (1.7), we study 

(1-8) G x wfh{x) ~ G yW f h {x) 

for each x in the state space of X^K By performing Taylor expansion on G X (i)fh(x), we 
find that the difference involves the product of partial derivatives of fh and a term bounded 
by a polynomial of x. Therefore, in addition to gradient bounds on fh, in a lot of cases we 
need bounds on various moments of |X^(oo)| which we refer to as moment bounds. The 
main challenge is that both gradient and moment bounds must be uniform in t. 

Component 3. In the last step, SSC comes into play when itself is not a CTMC, 

but a projection of some higher dimensional CTMC IjG) = £ U,t > 0}, where 

the dimension of the state space U is strictly greater than d. This is the case, for example, 
in the M/Ph/n + M system. It is this difference in dimensions that is responsible for 
most of the computational speedup in diffusion approximations; most complex stochastic 
processing systems exhibit some form of SSC [6, 9, 16, 18, 20, 32, 33, 45, 52, 5 ]. Let Gjj 
be the generator of and U^\ oo) have its stationary distribution. Now, BAR (1.6) 
becomes G U (e)F(U^\ oo)) = 0 for each ‘nice’ F : IA —> M. Furthermore, (1.7) becomes 

(1.9) E/ l (xW(oo)) - E/i(yW(oo)) = E[G YW fh(X^(oo)) - G uW F h (U®( oo))], 

where Fh : U —> M is the lifting of fh : —>• R defined by letting x € W l be the projection 

of u € U and then setting 

(1-10) F h (u) = f h (x). 

As before, we can perform Taylor expansion on Fh(u) to simplify the difference 
Gu(e)Fh(u) — G Y wfh(x). To use this difference to bound the right side of (1.9), we need 
a steady-state SSC result for U^( oo), which tells us how to approximate oo) from 
X^\oo) and guarantees that this approximation error is small. To obtain our SSC result, 
we need to rely heavily on the structure of the M/Ph/n + M system. 

In [27], Gurvich develops methodologies to prove statements similar to (1.1) for var¬ 
ious queueing systems. In particular, Gurvich develops important elements of the first 
two components of our framework in the special case when dim(G) = d. Along the way, 
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he independently rediscovers many of the ideas central to Stein’s method in the setting 
of steady-state diffusion approximations. He relies on the existence of uniform Lyapunov 
functions for the diffusion processes. Putting the Lyapunov functions together with the 
probabilistic solution for (1.4) and a-priori Schauder estimates for elliptic PDEs (see [25]), 
he is able to obtain uniform gradient bounds for a large class of Poisson equations. Fur¬ 
thermore, he also obtains the necessary uniform moment bounds using these Lyapunov 
functions by showing that uniform moment bounds for the diffusion process imply the 
same moments are uniformly bounded for the CTMC. However, his result on uniform mo¬ 
ment bounds no longer holds when dim(7/) > d due to the need for SSC, which poses an 
additional technical challenge. We overcome this challenge for the M/Ph/n + M system 
in Lemma 7, in which moment bounds are established recursively. 

The work in [ ] is conceptually close to this paper. In that paper, Gurvich packages 

all the components required to prove his results into several conditions, with the main 
condition being the existence of uniform Lyapunov functions for the diffusion processes. In 
contrast, a key contribution of our framework is its modular nature. The immediate benefit 
we gain is the ability to apply this framework to cases when SSC occurs (dim (U) > d). 
Moreover, although we also rely on Lyapunov functions to establish both moment and 
gradient bounds in our particular setting, our framework clearly illustrates that Lyapunov 
functions are merely tools one can use to establish these moment and gradient bounds; the 
bounds themselves are the actual drivers of our main results. 

We have already mentioned that Lemma 13.1 of [12] presents a systematic way to estab¬ 
lish gradient bounds in the one-dimensional setting (d = 1), and [4, 5, 41] establish gradient 
bounds in the multi-dimensional setting (d > 1) for a few special cases of Y^(oo). How¬ 
ever, establishing multi-dimensional gradient bounds remains a very difficult problem that 
usually requires using structural properties of the distribution of Y^\ oo). Gurvich’s use 
of a-priori Schauder estimates [ ] together with Lyapunov functions represents the first 

systematic approach to establishing multi-dimensional gradient bounds. 

With regards to using Lyapunov functions to establish moment bounds, certain systems 
may not require moment bounds at all. For example, approximating the stationary distri¬ 
bution of the simple birth-death process corresponding to a single-server queue does not 
require the use of moment bounds (although we do not consider the M/M/1 queue in 
this paper, Stein’s method is easily applicable to it). Thus, the modularity of our frame¬ 
work presents the components one needs to justify approximations for various systems, 
and promotes the view that Lyapunov functions are merely one of many tools to tackle the 
difficulties in these components. 

It is useful to compare the challenge level of each component in our framework. The gen¬ 
erator coupling is the least challenging component, because the class of functions for which 
(1.6) holds is usually rich enough. The remaining major difficulties are moment bounds, 
gradient bounds and SSC. Moment bounds and SSC are a property of the CTMC sequence 
{X^}//L 1 , and the difficulty in establishing them will depend heavily on the CTMCs. On 
the other hand, gradient bounds are tied to the diffusion processes {y(L , and are typ- 
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ically only difficult to establish when the diffusion processes are multi-dimensional. One 
important class of multi-dimensional diffusion processes for which we do not have gradi¬ 
ent bounds are semi-martingale reflected Brownian motions (SRBMs) [34]. An SRBM can 
approximate networks of single-server queues, such as generalized Jackson networks. The 
Schauder gradient bounds of [27] are not immediately applicable to SRBMs, because the 
corresponding Poisson equation is defined on the non-negative orthant, and has oblique 
reflection boundary conditions. 

Stein’s method is a powerful method that has been widely used in probability, statistics, 
and their wide range of applications such as bioinformatics; see, for example, the survey 
papers [11, 47], the recent book [12] and the references within. The connection between 
Stein’s method and diffusion processes was first made by Barbour in [4, 5]. In the context 
of Stein’s method, generator coupling is a realization of an abstract concept that first 
appeared in the famous commutative diagram in (28) of [50]; a more refined explanation 
of which is provided in (4) of [ ]. In particular, using Chatterjee’s notation in [ ], our 

EG xW f h (xW{oo)) in (1.7) is his E Taf(W). 

Diffusion approximations are usually “justified” by heavy traffic limit theorems. It is 
proved in [15] that for our M/Ph/n + M systems, 

(1.11) X w = [I (A) (t), t > 0} =>- Y = {Y(t),t > 0} 

as A goes to infinity while satisfying (1.2) (we use the arrival rate A to index these systems 
instead of the abstract t as before). Proving these limit theorems has been an active area of 
research in the last 50 years; see, for example, [7, 8, 31, 36, 37, 46] for single-class queueing 
networks, [9, 43, 5 1] for multiclass queueing networks, [38, 5 ] for bandwidth sharing net¬ 
works, [15, 30, 41] for many-server queues. The convergence used in these limit theorems 
is the convergence in distribution on the path space O([0, oo),R d ), endowed with Skorohod 
Ji-topology [19, 53]. The Ji-topology on D([0,oo),M cZ ) essentially means convergence in 
D([0, T],M rf ) for each T > 0. In particular, it says nothing about the convergence at “oo”. 
Therefore, these limit theorems do not justify the steady-state convergence. 

In [ 3], the authors prove the convergence of distribution AW(oo) to that of Y(oo) by 
proving an interchange of limits. The proof technique follows that of the seminal paper 
!2], where the authors prove an interchange of limits for generalized Jackson networks of 
single-server queues. The results in [22] were improved and extended by various authors 
for networks of single-servers [10, 39, 56], for bandwidth sharing networks [ 5], and for 
many-server systems [21, 28, 51]. These “interchange limits theorems” are qualitative and 
thus do not provide rates of convergence as in (1.1). 

1.1. Notation. All random variables and stochastic processes are defined on a com¬ 
mon probability space (DjJ 7 , P) unless otherwise specified. For a stochastic process X = 
{X(t),t > 0} that has a unique stationary distribution we let X(oo) be the random element 
having the stationary distribution of X. For a sequence of random variables {X n }^ =1 , we 
write X n =>- X to denote convergence in distribution (also known as weak convergence) of 
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b 

X n to some random variable X. If a > 6, we adopt the convention that ^(-) = 0. For 

i=a 

an integer d > 1, M. d denotes the d-dimensional Euclidean space and denotes the space 
of d-dimensional vectors whose elements are non-negative integers. For a, b € M, we define 
aV b = max{a, b} and aAb = min{a, b}. For x € R, we define x + = xV0 and x~ = (— x) VO. 
For x € R d , we use Xi to denote its zth entry and |x| to denote its Euclidean norm. For 
x, y € M. d , we write x < y when x* < y\ for all i and when x < y we define the vector 
interval [x,y\ = {z : x < z < y}. All vectors are assumed to be column vectors. We let x T 
and A T denote the transpose of a vector x and matrix A, respectively. For a matrix A, we 
use A t j to denote the entry in the ith row and jth column. We reserve I for the identity 
matrix, e for the vector of all ones and e^' for the vector that has a one in the zth element 
and zeroes elsewhere; the dimensions of these vectors will be clear from the context. 

1.2. Outline for Rest of Paper. The rest of the paper is structured as follows. Section 2 
formally defines the M/Ph/n + M system as well as the diffusion process whose steady- 
state distribution will approximate the system. Section 3 states our main results. Section 4 
describes the CTMC representation of the M/Ph/n + M system. Section 5 introduces 
the first two components of our framework; the Poisson equation and generator coupling. 
Section 6 describes the SSC result, illustrating the third component of our framework. It 
is here that the reader may see the reason behind our slower rate of convergence. This 
framework is then used in Section 7 to prove our main results. Appendix A contains the 
proofs for most of the lemmas. 

2. Models. In this section, we give additional description of the M / Ph/n + M system 
and the corresponding diffusion model. 

2.1. The M/Ph/n + M System. The basic description of the M/Ph/n + M queueing 
system was given in the first paragraph of the introduction. Here, we describe the dynamics 
of the system. Upon arrival to the system with idle servers, a customer begins service 
immediately. Otherwise, if all servers are busy, the customer enters an infinite capacity 
queue to wait for service. When a server completes serving a customer, the server becomes 
idle if the queue is empty, or takes a customer from the queue under the first-in-first- 
out (FIFO) service policy if it is nonempty. Recall that the Ph indicates that customer 
service times are i.i.d. following a phase-type distribution. We shall provide a definition of 
a phase-type distribution shortly below. The phase-type distribution can approximate any 
positive-valued distribution [3, Theorem III.4.2], 

Recall that A denotes the arrival rate of the system. We use 1/a to denote the mean 
patience time. In our study, we take the service time distribution and a fixed, but allow the 
arrival rate A and the number of servers n to grow without bound. Throughout this paper, 
we assume that n follows the square-root-safety staffing rule in (1.2). In the pioneering 
paper of [30] , the authors studied these systems as A —>• oo and n grows to infinity following 
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(1.2). This parameter regime is now known as the Halfin-Whitt regime. In this regime, 
the system has high server utilization and at the same time has small customer waiting 
time and abandonment fraction. Therefore, this regime is also known as the quality- and 
efficiency-driven (QED) regime, a term coined by [23]. 

Phase-type Service Time Distribution. A phase-type distribution is assumed to have 
d > 1 phases. Each phase-type distribution is determined by the tuple (p,v,P), where 
p € M. d is a vector of non-negative entries whose sum is equal to one, v € W l is a vector of 
positive entries and P is a d x d sub-stochastic matrix. We assume that P is transient, i.e. 

( 2 . 1 ) (/ —P ) _1 exists, 

and without loss of generality, we also assume that the diagonal entries of P are zero 

(Pa = 0 ). 

A random variable is said to have a phase-type distribution with parameters (p, v, P ) if 
it is equal to the absorption time of the following CTMC. The state space of the CTMC 
is {1, ...,d + 1}, with d + 1 being the absorbing state. The CTMC starts off in one of the 
states in {1, ...,d} according to distribution p. For i = 1, the time spent in state i is 
exponentially distributed with mean I/ 17 . Upon leaving state i, the CTMC transitions to 
state j = 1 ,...,d with probability P,j , or gets absorbed into state d + 1 with probability 

The CTMC above is a useful way to describe the service times in the M/Ph/n + M 
system. Upon arrival to the system, a customer is assigned her first service phase according 
to distribution p. If the customer is forced to wait in queue because all servers are busy, she 
is still assigned a first service phase, but this phase of service will not start until a server 
takes on this customer for service. Once a customer with initial phase i enters service, her 
service time is the time until absorption to state d + 1 by the CTMC. We assume without 
loss of generality that for each service phase i, either 

( 2 . 2 ) pi > 0 or Pji > 0 for some j. 

This simply means that there are no redundant phases. 

We now define some useful quantities for future use. Define 

(2.3) R = (/ — P T )diag(^) and 7 = pR~ 1 p, 

where the matrix diag(^) is the d x d diagonal matrix with diagonal entries given by the 

d 

components of v. One may verify that y] 7 , = 1. One can interpret 7 j to be the fraction 

2=1 

of phase i service load on the n servers. 

For concreteness, we provide two examples of phase-type distributions when d = 2. 
The first example is the two-phase hyper-exponential distribution, denoted by H 2 . The 
corresponding tuple of parameters is (p, v, P ), where 

P = (Pi,P 2 ) T , v = {vi,V 2 ) T , and P = 0. 
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Therefore, with probability pi, the service time follows an exponential distribution with 
mean I/ 17 . 

The second example is the Erlang-2 distribution, denoted by E 2 . The corresponding 
tuple of parameters is (p,v,P), where 

P = (1) 0) T , v = (6,0) T , and P = jV 

An E 2 random variable is a sum of two i.i.d. exponential random variables, each having 
mean 1 /d. 

2.2. System Size Process and Diffusion Model. Before we state the main results, we 
introduce the process we wish to approximate, as well as the approximating diffusion 
process - the piecewise OU process. Recall that X = {A(f) € M. d ,t > 0} is the system size 
process, where 

X(t) = (X 1 (t),...,X d (t)) T , 

and Xi(t) is the number of customers of phase i in the system (queue + service) at time 
t. We emphasize that X is not a CTMC, but it is a deterministic function of a higher¬ 
dimensional CTMC, which will be described in Section 4. 

The process X depends on A ,n,a,p,P, and v. However, in this paper we keep a,p,P, 
and v fixed, and allow A and n to vary according to (1.2). For the remainder of the paper 
we write Al( a ) to emphasize the dependence of A on A; the dependence of on n is 
implicit through ( 1 . 2 ). 

Recall the definition of 7 in (2.3) and define the scaled random variable 

(2.4) IW(oo) = 6(XW( 00 ) - jn), 
where, for convenience, we let 

(2.5) 5 = 1/VX. 

To approximate A^ a ^(oo), we introduce the piecewise OU process Y = {Y(t),t > 0}. This 
is a d-dimensional diffusion process satisfying 

( 2 . 6 ) Y(t) = Y (0) — pf3t ~ R f (Y(s)-p(e T Y(s)) + )ds-ap [ (e T Y(s)) + ds + VEB(t). 

Jo Jo 

Above, B(t) is the d-dimensional standard Brownian motion and \/E is any d x d matrix 
satisfying 


VeVe 7 = E = diag(p) + ^ 7 k^kH k + (I - P 1 )diag(i/)diag( 7 )(/ - P), 

k =1 


(2.7) 
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where the matrix H k is defined as 

Hi = P ki ( 1 - P ki ), H% = -P ki P kj for j + i. 

Comparing the form of S above to (2.24) of [ ] confirms that it is positive definite. Thus 
exists. Observe that Y depends only on (5, a,p, P, and u, all of which are held constant 
throughout this paper. 

The diffusion process in (2.6) has been studied by [ ]. They prove that Y is positive 

recurrent by finding an appropriate Lyapunov function. In particular, this means that Y 
admits a stationary distribution. 

3. Main Results. We now state our main results. 


Theorem 1. For every integer m > 0, there exists a constant C m = C m (j3, a,p, v, P ) > 
0 such that for all locally Lipschitz functions h : —>■ M satisfying 


| h{x)\ < \x\ 2m for x € 


we have 


Eh(X( x \oo)) -Eh(Y(oo)) 

satisfying (1.2), which we recall below as 

np = A + /3VX 


< for all A > 0 
V A 


Theorem 1 will be proved in Section 7. As a consequence of the theorem, we immediately 
have the following corollary. 

Corollary 1 . There exists a constant C\ = C\(/3, a,p, v, P) > 0 such that 


sup 

fceWG) 


E/i(aA a ) (oo)) — Eh(Y (oo)) 


< for all A > 0 
v A 


satisfying (1.2), where is defined in (1.3). In particular, 

(oo) => y(oo) as A —>• oo. 

Proof. Suppose h G Without loss of generality, we may assume that h( 0) = 0, 

otherwise we may simply consider h(x) — h{ 0). By definition of W^ d \ 


\h(x)\ < |x| for x € 

and the result follows from Theorem 1 with m = 1. 


□ 


Remark 1. For any fixed f3 € R, there are only finitely many combinations of A € (0,4) 
and integer n > 1 satisfying (1.2). Therefore, it suffices to prove Theorem 1 by restricting 
A > 4, a convenience for technical purposes. 
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4. Markov Representation. The M/Ph/n + M system can be represented as a 


CTMC 


U W = {u( x \t),t > 0} 


taking values in U, the set of finite sequences {u\, ...,Uk} ■ The sequence u = {u\, ...,Uk} 
encodes the service phase of each customer and their order of arrival to the system. For 
example, the sequence {5,1,4} corresponds to 3 customers in the system, with the service 
phases of the first, second and third customers (in the order of their arrival to the system) 
being 5, 1 and 4, respectively. We use |n| to denote the length of the sequence u. The 
irreducibility of the CTMC is guaranteed by (2.1) and (2.2). 

We remark here that £A a ' is not the simplest Markovian representation of the M / Ph/n+ 
M system. Another way to represent this system would be to consider a d + 1 dimensional 
CTMC that keeps track of the total number of customers in the system, as well as the total 
number of customers in each phase that are currently in service; this dpi dimensional 
CTMC is used in [15]. In this paper we use the infinite dimensional CTMC U^ because the 
system size process X^ cannot be recovered sample path wise from the d+ 1 dimensional 
CTMC, it can only be recovered from U^ x \ Also, the CTMC U ^ will play an important 
role in our SSC argument in Section 6. 

In addition to the system size process we define the queue size process Q^ = 

{Q^ x \t) G > 0}, where 


Q {x \t) = (Q[ x \t),...,Q {x \t)) T , 


and Q[ X) (t) is the number of customers of phase i in the queue at time t. Then xj X> (t) — 
Q[ X \t) > 0 is the number phase i customers in service at time t. 

To recover X( x \t) and Q^ x \t) from U^ x \t), we define the projection functions Hx '■ 
U —> M d and IIq : U —> R d . For each u £U and each phase * € (1,..., d}, 


M M 

( n iW)i = and ( n Q( u ))*= 

k =1 k=n -\-1 

It is clear that on each sample path 

(4.1) X^ x \t) = n x {U W (t)) and Q W (t) = U Q {U W (t)) for t > 0. 

Because there is customer abandonment the Markov chain U^ can be proved to be positive 
recurrent with a unique stationary distribution [13]. We use U^( oo) to denote the random 
element that has the stationary distribution. It follows that A’^(oo) = IFy(C( a ) (oo)) has 
the stationary distribution of A( a ), and A4 a )(oo) in (2.4) is given by 

xW(oo) = 5(Ux{U^ x \oo)) - yn). 


(4.2) 
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For u £U, we define 

(4.3) x = 5(IIx(w) - 7 n ); Q = n<g(?x) and z = U x (u) - q. 

When the CTMC is in state u, we interpret (II^ Qi, and z; t as the number of the phase 

i customers in system, in queue, and in service, respectively. It follows that z >0. 

Let G V (x) be the generator of the CTMC U^ x \ To describe it, we introduce the lifting 
operator A. For any function / : R d —» R, we define Af : U —> R by 

(4.4) Af(u) = f(5(U x (u) - qn)) = f(x). 

Hence, for any function / : R d —> R, the generator acts on the lifted version Af as follows: 

d d 

GjjwAf(u) = E X Pi(f( x + 5e W ) - f(x)) + ^2aqi(f(x- <5e W ) - f(x)) 

i= 1 2—1 

d d 

+ y' J VjZi[y' j Pijf{x + <5e (i) - <5e w ) 
i =1 j =1 

d 

(4.5) + (1 - E - Se ®) - /(*) ■ 

1=1 

Observe that Gjj(\)Af(u ) does not depend on the entire sequence u; it depends on x, q, 
and the function / only. 

5. The Generator Coupling of Stein’s Method. This section is devoted to devel¬ 
oping a generator coupling of Stein’s method. This framework will be used in Section 7 to 
prove Theorem 1. 


5.1. Poisson Equation. The main idea behind Stein’s method is that instead of bound¬ 
ing 

(5.1) Eh(X (x) (oo)) - Eh(Y{oo)), 
one solves the Poisson equation 

(5.2) Gyfh(x) = h(x) - E/i(y(oo)), 

where the generator Gy of the diffusion process Y, applied to a function / € C 2 (M. d ), is 
given by 

dr d 

Gy fix) = Y dif(x) pi )3 - Vi{xi -piie T x) + ) - apiie T x) + + J] PjiVj{xj -pj(e T x) + ) 
i= 1 L 1=1 


(5.3) 


+\ S s ijdijf(x ) for x € R d . 

*,l=i 
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Then, to bound the difference in (5.1), it is sufficient to find a bound on 

(5.4) EGVA(xW(oo)). 

The following lemma, based on the results of [27], guarantees the existence of a solution 
to (5.2) and provides gradient bounds for it. The proof of this lemma is given in Section 
A.l. 


Lemma 1. For any locally Lipschitz function h : M. d —>• K satisfying |/i(a;)| < |x| 2m , 
equation (5.2) has a solution fh. Moreover, there exists a constant C(m , 1) > 0 (depending 
only on ((3, a,p, v, P)) such that for x € 


(5.5) 

(5.6) 

(5.7) 

(5.8) 


\dif h (x)\ 


I dijf h (x)\ 


sup 

y£R d :\y— x|<l 


I djjfhjy) - djjfhix )| 

I y - A 


< C(m, l)(l + |x| 2 ) m , 

< C(m, 1)(1 + |x| 2 ) m (l + |x|), 

< C(m, 1)(1 + |x| 2 ) m (l + |x|) 2 , 

< C(m, l)(l + |x| 2 ) m (l + |x|) 3 . 


5.2. Generator Coupling. Let denote the random variable Gyfh(X^( oo)) in 

(5.4). To prove |EL1T( A ) | small, a common approach in using the Stein’s method is to find 
a coupling for so that 


(5.9) 

(5.10) 


evl ( a) 


is small, and 


E 


— wW 


is small. 


Constructing an effective coupling is an art that is problem specific. See [ ] for a recent 

survey that includes examples of various couplings. 

We use = Gu^Af^iU^foo)) to construct the coupling, where A is the lifting 

operator defined in (4.4). The following lemma justifies the coupling propety (5.9). 

Lemma 2. Let h : M. d —>• K satisfy \h(x)\ < |x| 2m . The function fh given by (5.2) 
satisfies 

(5.11) EG uW Af h (U^(oo)) = 0. 


To prove the lemma, we need finite moments of the steady-state system size. 
Lemma 3. (a) Let L(u) = exp(e T ILv(u)) for u £lL. Then 

(5.12) EL([/ (A) (oo)) < oo. 

(b) all moments ofe T X^ x \ oo) are finite. 
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Proof. One may verify that 

G U (x)L(u) < A(exp(l) — 1 )L{u) — a(e T ILv(M) — n) + (l — exp(— l))L(u). 

It follows that there exist a positive constant C = C{ A, n, a) such that, whenever e T Iix{u) 
is large enough, 


(5.13) 


G^j(\)L{u) < — CL(u ) + 1 . 


Part (a) follows from [42, Theorem 4.2]. Part (b) follows from (5.12) and the equality 
e T U X (UW( °o)) = e T X^( oo). □ 

The function L(u) is said to be a Lyapunov function. Inequality (5.13) is known as a 
Foster-Lyapunov condition and guarantees that the CTMC is positive recurrent; see, for 
example, [42]. 

Proof of Lemma 2. A sufficient condition for (5.11) to hold is given by [35, Proposi¬ 
tion 1.1] (alternatively, see [26, Proposition 3]), namely 


(5.14) 


E 


G uW (U W ( oo),t/ (A) (oo)) Af h (U W ( oo)) 


< oo. 


Above, Gjj(x) (it, u ) is the uth diagonal entry of the generator matrix G V (x) ■ In our case, 
the left side of (5.14) is equal to 


= E 


= E 


< E 


G uW (t/ (A) (oo), U (A) (oo)) I I f h (X W (oo)) 


A + a(e T X < ' X \ oo) - n) + + ^ ^(A 4 - A) (oo) - q\ X \oo) 


iW i 


i— 1 


f h (X W (oo)) 


A + (a V m&x{i'i})e T X^ ( oo) 


f h (X (X H oo)) 


where the first equality follows from (4.2) and (4.4). One may apply (5.5) and (5.12) to see 
that the quantity above is finite. □ 

5.3. Taylor Expansion. To prove that the coupling IpA) = G U (x)Afh(U^ x \ oo)) satisfies 
the coupling property (5.10), we need to prove that 


E 


W W - JtW = E G uW Af h (uW(oo)) - G Y f h (X W ( oo)) 
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is small. For that, we compare the generator Gy(\) of the CTMC with Gy. By performing 
Taylor expansion on G U (x)Af/ l (u) in (4.5), one has 


Gjj(\)Afh{u ) 


(5.15) 


Y Xpi{5dif h (x ) + y<9uA(4 + )) + a( n{ - Sdif h {x) + y9jj/ h (^ )) 


d 

+Y viZi 

i =1 


d 

(1 - Y Pi j) ( “ Sdifh(x) + 


3 = 1 


rz a 

— diifh(^)) + Y Pi i ( ~ 5d ifh(x) 

3 =1 


A(x) + ^-dufhi&j) + - 5 2 d ij f h {£, 


u 


where ^ € [x, x + <feW], ^ € [x — <SeW , x] and £ij lies somewhere between x and x — 5e® + 
6 eV\ Using the gradient bounds in Lemma 1, we have the following lemma, which will be 
proved in Section A.2. 


Lemma 4. There exists a constant C(m , 2) > 0 (depending only on (/3, a,p, v, P)) such 
that for any u € G, 


Gu(x)Af h (u) - Gyfhix) 


(5.16) 


Y d ifh( x )\( 


d 


i =1 


Vi — a — Y^ PjiVj){ 5 qi - Pi( 

3 =1 


- p,;(e T x) + ' 


+ E(u), 


where q and x are as in (f.3), 5 as in (2.5), and E(u ) is an error term that satisfies 

\E(u)\ < 5C(m, 2)(1 + |x| 2 ) m (l + |x|) 4 . 


6. State Space Collapse. One of the challenges we face comes from the fact that 
our CTMC UW is infinite-dimensional, while the approximating diffusion process is only 
d-dimensional. Recall the process defined in (4.1) and the lifting operator A 

acting on functions / : W l —>• R, as defined in (4.4). When acting on the lifted functions 
Af{U^ x \ oo)), the CTMC generator G V {x) depends on both Af( A )(oo) and Q( a )(oo), but its 
approximation Gy f(X^( oo)) only depends on A4 a )(oo). This is captured in (5.16) by the 
term 


d 

Y 9 dh(x) 

1=1 



d 

- a ~Y p ji v j)( S Qi - Pi(e T x ) + ) 
3 = 1 
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To bound this term, observe that for any 1 < i < d, 
d 

(ui -a - ^ Pj^ dif h (x ) (Sqt - pi{e T x) + ) 

3 =i 

d 

= (vi-a-^2 p ji u 3 ) ( difh(x ) - dif h (x -5q+ p(e T x) + )^j (Sqi - pi{e T x) + ) 

3 =1 

d 

+ (uj-a~ ^ PjiV^dif h {x - 5q + p(e T x) + ) (% -pj(e T x) + ) 
j=i 

d d 

= (vi - a - ^2 Pji^j) y^Pikfh(Q($(lk - Pk{e T x) + ){5qi - pi{e T x) + ) 

j =1 fe=l 

d 

(6.1) + (i/j - a - y ^PjiVj)dif h (5(z - jn) +p(e T x) + ) (% - pj(e T x) + ) , 

j=i 

where z, defined in (4.3), is a vector that represents the number of customers of each type 
in service, and £ is some point between x and x — 5q + p(e T x) + . In particular, there exists 
some constant C that doesn’t depend on A and n, such that 

(6.2) |£| < |x| + 5\q\ + \p\ ( e T x) + <C\x\, 

because dqi < ( e T x) + for each 1 < i < d (i.e. the number of phase i customers in queue 
can never exceed the queue size). 

In order to bound the expected value of (6.1), we must prove a relationship between 
X( A )(oo) and Q^(po). Intuitively, the number of customers of phase i waiting in the 
queue should be approximately equal to a fraction pi of the total queue size. The following 
two lemmas bound the error caused by the SSC approximation. They are proved at the 
end of this section. 


Lemma 5. Let ZP\ oo) = X( A )(oo) — QP\oo) be the vector representing the number 
of customers of each type in service in steady-state. Then conditioned on {e T X^ x \oo)) + , 
the random vectors Q^\ oo) and ZP\ oo) are independent. Furthermore, 


(6.3) 


E 


SQ™ (oo) - p{e T X^ (oo))+ (e T !W (oo))' 


= 0 , 


and for any integer m > 0, there exists C(m , 3) > 0 (depending only on (/3, a,p, u, P)) such 
that for all A > 0 and n > 1 satisfying (1.2), 


(6.4) 


E 


2m n 


(SQM (oo) - p{e T xW (oo))+ " < S m C(m, 3)E[(e T i< A ) (oo)) 


where 6 = l/y/\ as in (2.5). 
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Lemma 6. For any integer m > 0, there exists C(m,A) > 0 (depending only on 
((5, a,p, v, P)) such that for any locally Lipschitz function h : -A M. satisfying \h(x)\ < 

\x\ 2m , and all A > 0 and n > 1 satisfying (1.2) 


dr d 

E E dif h (X W ( oo)) (vi-a-^P ji v j ){8Qf'\oo)-pi(e T x( x Xoo)) + ) 

3 = 1 


2—1 


(6.5) < 5C(m,4)E ((e T l< A >( °o))+) 2 


E 


1 + 


XW( 


oo. 


where fh(x ) is t/ie solution to the Poisson equation (5.2). 


Proof of Lemma 5. We begin by proving (6.4), for which it suffices to show that for 
ali A > 0 and n > 1 satisfying (1.2) 


E 


2m i 


(oo) — p{e T X^ (oo) — n) + < C(m, 3)E[(e T (oo) — n) + ] 


We first prove a version of (6.4) for any finite time t > 0. Then, (e T X^ (t) — n) + is the 
total number of customers waiting in queue at time t. Assume that the system is empty 
at time 2 = 0, i.e. X^(0) = 0. Fix a phase i. Upon arrival to the system, a customer is 
assigned to service phase i with probability pt. Consider the sequence {£j : j = 1,2,...}, 
where fj is one if the j th customer to enter the system was assigned to phase i. and zero 
otherwise. Then {£j : j = 1,2,...} is a sequence of iid Bernoulli random variables with 
P(£j = 1) = pi. For t > 0, define A(t) and B(t) to be the total number of customers to 
have entered the system, and entered service by time t, respectively. Also let Q (t) be the 
indicator of whether customer j is still waiting in queue at time t. Then 


( 6 . 6 ) 

(6.7) 


Alt ) 

(e T X( x \t) -n) + = E 


Alt) 

Q\ x \t)= E &<*(*)■ 

j=B(t )+1 


Let Z'At (t) = X^' (f) — ( t ) be the vector keeping track of the customer types in service 

at time t and let B(£,pf) be a binomial random variable with t € Z + trials and success 
probability pi. Assuming A"^(0) = 0, by a sample path construction of the process 
one can verify that for any time t > 0, the following three properties hold. First, for any 
2 € El}, a, b E E + with a > 1, and x ±,..., x a , y ±,..., y a € {0,1}, 

P(&+i =xi,... ,£ b +a = %a | A(t) = b + a,B(t) = 6,Z (A) (t) = z, 

Cfc+l = yii ■ ■ ■ > Cb+a = Va) 

= P(£i = ar)p(£ 2 = x 2) •••P(Co = X a ) 

= Pp= I! *(l-Pi) a -£?= 1X1 . 


( 6 . 8 ) 
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The right side of (6.8) is independent of b , z , y i,... ,y a - It then follows from (6.6), (6.7) 
and (6.8) that for any integer l > 1, g, € Z + , and z € Z+, 

P(Q? } (t) = qi | ( e T x( A )(t) - n)+ = = z) 

= F(Qf } (t) = Qi | (e T #(t) - n)+ = 

(6.9) =P(S(£,p,) = %). 


Since (6.9) holds for all t > 0, it holds in stationarity as well. 

We now say a few words about how to construct and argue (6.8)-(6.9). One would 
start with four primitive sequences: a sequence of inter-arrival times, potential service 
times, patience times, and routing decisions. The sequence of potential service times would 
hold all the service information about each customer provided they were patient enough to 
get into service. The routing sequence would represent the phase each customer is assigned 
upon entering the system. 

To see why (6.8) is true, we first observe that at any time t > 0, the random variable 
A(t) depends only on the inter-arrival time primitives; in particular, it is independent of 
the routing sequence {£,j,j > 1}. Second, any customer to arrive after customer number 
B(t) = b has no impact on any of the servers at any point in time during [0, t]. In particular, 
the primitives including > 1} associated to those customers are independent of 

B{t) = b and Z^\t). Lastly, the decisions of those customers whether to abandon or not 
by time t depends only on their arrival times, patience times, and the service history in the 
interval [0,t]. In particular, the sequence > 1} is independent of {^6+j, j > 1}. 

This proves the the first equality in (6.8). 

We now move on to complete the proof of this lemma. We use (6.9) to see that for any 
positive integer N, 


( 6 . 10 ) 


< 


®([Qi X) (t) -Pi(e T xM(t) - n) + } 2 m l {{e T X W( t )-n)+<N} 

N 

^E[(B(e, Pi ) -p^) 2m ]p((e T X (A) (f) -n)=t) 
t =l 
N 

C(m , 6)f"P((e T X( A ) (t) - n) = l) 

{(e T X( x ) (t)—n)+ <N} j i 


1=1 


C(m,6)E^[(e T X^\t) - n)+Pl 


where we have used the fact that there is a constant C(m, 6) > 0 such that 


E 


(B(e, Pi )- Pi £) 2m 


< C(m, 6 )r 


for all £ > 1; 


see, for example, (4.10) of [40] . Letting t —> oo in both sides of (6.10), by the dominated 
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convergence theorem, one has 

- Pi(e T X {X \oo) - n) + ] 2 m l {(e T_ Y (A )(oo) _ n) +< 7V} ) 

< C(m, 6)K^[(e T (oo) - n) + ] m l {(e T A - ( A )(oo) _ ri) +< 7V} ) . 

Letting N —>• oo, by the monotone convergence theorem, one has 

E(Qf ) (oo) — pi(e T X^ x \ oo) — n) + ) 2m < C(m, 6)E[(e T X^ A - l (oo) — n) + ]' n . 

Then (6.4) follows from this inequality for each i and the fact that there is a constant 
B m > 0 such that \x\ 2m < B m Yli=i( x i) 2m f° r x £ One can check that (6.3) can be 
obtained by an argument very similar to the one used to prove (6.4). □ 


Proof of Lemma 6. Recall that 

Z (A) ( oo) = X (A) (oo)-Q (A) (oo) 


is the vector representing the number of customers of each type in service in steady-state. 
Then from (6.1) we have 


E 


dif h (X {X) (oo)) (5Q[ X) (oo) - Pi (e T xM (oo))+) 

d ik f h (0{5Qi X \°°) ~ Pk(e r X W (oo)) + )(6Q ( : X \oo) - p i (e T X ( - A) (oo)) + ) 
dif h (s(Z w (oo) -7 n ) + p(e T X W (oo)) + ^j (5Q[ X \oo) -pi(e T X^ A \ oo)) + ) 


k =1 

+ E 


By Lemma 5, the second expected value equals zero. For the first term, one can use the 
Cauchy-Schwarz inequality, together with the gradient bound (5.7) and the SSC result 
(6.4) to see that for all 1 < i, k < d, 


E 


dikfh{£,){$Q[ X \oo) - p k (e T X w (oo)) + ) (5Q[ X \oo) -pj(e T X (A) (oo)) + ) 


< 


1 


E 


{dtkfhioy 


'E (<5Qj; A) (oo) - p fc (e T X( A )(oo))+y E ^<5Q^(oo) - pi(e T XW (oo)) 


< SC( 2, 3)E[(e T lW (oo ))+] 2 JE [(d ik f h (0) 

< 5C'(2,3)E[(e r x( A )(oo)) + ] 2 C(m,l),/E[(l + |C| 2 ) 2 (l + |C|) 4 
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We now combine everything together with the fact that £ satisfies (6.2) to conclude that 
there exists a constant C(m,4) that does not depend on A or n, such that 


d r d 

^<9iIE f h (X W ( oo)) (vi-a-^Pj i v j ){8Q^\oo)-pi(e T x( x Xoo)) + ) 
L j=l 


i =1 


< 5C (m, 4)E [(e T X^ ( oo)) + ] 2 
which concludes the proof of the lemma, 


E 


1 + 


1W( 


oo. 


□ 


7. Proof of Theorem 1. To prove Theorem 1, we need an additional lemma on 
uniform bounds for moments of scaled system size. It will be proved in Section A.3. 

Lemma 7. For any integer m > 0, there exists a constant C(m, 5) > 0 (depending only 
on (/3, a,p, v, P)) such that 


(7.1) 


E 


X (A) ( 


oo 


< C(m, 5). 


We remark that in the special case when the service time distribution is taken to be 
hyper-exponential, it is proved in [ 1] that 


lim sup E exp ( 8 

A—>-oo 


A (a) (oo) 


< oo 


for 8 in some positive interval. The proof relies on a result that allows one to compare 
the system with an infinite-server system, whose stationary distribution is known to be 
Poisson. 


Proof of Theorem 1. It follows from Lemmas 4 and 6 that 


Eh(X^ x \oo))-Eh{Y{oo)) 


E G uW Af h (U^(oo)) - EG Y f h (X^(oo)) 



d 



d 

< 

E e 

d t f h (X^( oo)) 

{Vi - a 

- P a v i)( S Qi X \°°) - Pi(e T X [X) ( oo)) + ) 


i —1 



3 = 1 


+5C(m, 2)E[(1+ X (a) (oo) ) m (l + A (a )(oo) ) 4 
< 5C(m, 4)E[((e r l( A >( oo))+) 2 


E 


1 + 


XW(oo) 


+<5C'(m,2)E[(l+ I (A) (oo) ) m (l + jW(oo) ) 4 


(7.2) 
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By Lemma 7, there are constants Bi(m), B 2 (m) > 0 (depending only on (j3,a,p,u, P)) 
such that 


((e T l( A )(oo)) + )' 


E 


1 + 


1W( 


oo ; 


< B^m), 


(1+ X (A) ( °o)| ) m (l + |x( A )(oo)|) 4 l <B 2 {m). 


Therefore, the right side of (7.2) is less than or equal to 

5C(m,A)Bi(m) + dC(m, 2 )B 2 (m) 

[c(m,P)Bi{m) + C (m,2)B 2 (m) S j ~^= for A > 0. 


< 


This concludes the proof of Theorem 1. 


□ 
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APPENDIX A: PROOFS 

A.l. Proof of Lemma 1 (Gradient Bounds). Before proving the lemma, we first 
state the common quadratic Lyapunov function introduced in [17]. This Lyapunov function 
plays a key role in our paper. As in (5.24) of [17], for x € M d , define 

(A.l) U(x) = (e T x) 2 + k[x — p<j){e T x)]'M[x — p4>(e T x)\, 


where k > 0 is some constant, M is some d x d positive definite matrix, and the function 
0 is a smooth approximation to x i —> x + and is defined by 




x, if x > 0, 

< —|e, if x < —e, 
smooth, if — e < x < 0. 


In (5.24) of [ 7], the authors use Q to represent the positive definite matrix that we called 
M in (A.l). We use M instead of Q on purpose, to avoid any potential confusion with 
the queue size Q(t). For our purposes, “smooth” means that 4> can be anything as long as 
<f> € C 3 (M c! ). We require that the “smooth” part of <j) also satisfies — \t < 4>(x) < x and 
0 < 4>'(x) < 1. For example, cj) can be taken to be a polynomial of sufficiently high degree 
on (—e,0) and this will satisfy our requirements. The vector p is as in (2.6). The constant 
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k and matrix M are chosen just as in [ 7]; their exact values are not important to us. In 
their paper, they show that V satisfies 

GyV(x) < —c\V{x) + C 2 for all x € W l 

for some positive constants ci,C 2 ; this result requires a > 0, i.e. a strictly positive aban¬ 
donment rate. Before proceeding to the proof of Lemma 1, we state two bounds on V that 
shall be useful in the future. For some constant C > 0, 

(A.2) V{x) < C(1 + |x| 2 ), 

(A.3) \x\ 2 <C(l + V{x)). 

The first is immediate from the form of V, while the second is proved in [17]. 

Proof of Lemma 1. Without loss of generality, we may assume that h( 0) = 0, oth¬ 
erwise one may consider h(x) — h( 0). This lemma is essentially a restatement of equation 
(22) and equation (40) from the discussion that follows after [27, Theorem 4.1]. We verify 
that (22) and (40) are applicable in our case by first confirming that we have a function 
satisfying assumption 3.1 of [27]. Recalling the definition of V from (A.l), when (j> is taken 
to be a polynomial (of sufficiently high degree to guarantee V € C 3 (R“)), the function 

1 + V{x) 

satisfies assumption 3.1. To verify condition (17) of Assumption 3.1, one observes that 

X w (t) < X (A )(0 ) + n + A w (t), 

where A^ x \t) is the total number of arrivals to the system by time t and it is a Poisson 
random variable with mean A t for each t > 0. The properties of Poisson processes then 
yield (17). By [27, Remark 3.4], 

C{l + V{x)) m 

also satisfies assumption 3.1 for any constant C > 0. Since we require that |/i(x)| < |x| m , 
by (A.3) we have 

| h(x) - Eh(Y(oo))| < \x\ m + E \Y (oo)| m < C m ( 1 + V(x)) m . 

The finiteness of E |Y(oo)| m is guaranteed because one of the conditions of assumption 3.1 
is that 

G v ( 1 + V{x)) m < ci(1 + V{x)) m + c 2 

for some positive constants c\ and C 2 - Therefore, equation (22) gives us (5.5) and equation 
(40) gives us (5.6) and (5.7). We get (5.8) by observing that in the discussion preceding 
(40), everything still holds if we replace B x (l/y/n ) by an open ball of radius 1 centered at 
x. We wish to point out that the constants in (40) and (22) do not depend on the choice 
of function h. □ 
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A.2. Proof of Lemma 4 (Generator Difference). The main idea here is that 
Gyfh(x) is hidden within G U (x)Aff l (u), where the lifting operator A is in (4.4). We alge¬ 
braically manipulate the Taylor expansion of G U (x)Aff l (u ) to make this evident. First, we 
first rearrange the terms in the Taylor expansion (5.15) to group them by partial deriva¬ 
tives. Thus, G U (\)Afh(u) equals 


y, Sdif h (x) Pi\ - aqi - v t z t + 


1=1 


3 = 1 


jiVjZj 


d 


d 


ij Vi Zi 


+ y — dafhix) PiX + aqi + ViZi + y Pji^jZj - y S 2 dijf h (x ) [P { 

i=l j =1 ijtj 

d g 2 d 

+ y y (diifhiil) - duf h (xfj aqi + (1 - y Pij)uiZi 

i= 1 j =1 

dr -2 d 

+ y Y ) - d«/ft(*)) [Api] - y 6 2 (dijfhdij) - dijf h (x )) 


i=1 


+yy y (dnfh^ij) - duf h {x 


*7 

PijlZiZi T PjilZjZj 


i =1 1=1 


To proceed we observe that (2.3) gives us the identity 


d 

(A.4) - I'i'-fin + y PjiVj^jn = -npj. 

1=1 


Recall the form of Gyfh(x) from (5.3). From the form of E in (2.7), we see that 


d 

(A.5) Ejj = 2(y + y PjijjVj ) , Ejj = -{PijVili + PjiVjlj) for j / i 

l=i 

using (5.3), (A.4) and (A.5), the difference G U (x)Afh(u) — Gyfh(x) becomes 
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(A.6) ^2 dif h (x) (ui-a-^2 PjiVj)(5qi - Pi{e T x) + ) 


i= 1 


3 = 1 


+ 


^2 diif h (x)[^2P j iV j lj {n6 2 - 1) - d ijfh{x) 

*7 H 


Pij^ili + PjiVjlj (nS 2 - 1) 


i =1 


j=l 


<5 2 


y: —dafh(x) Pi( A - n) - aqi - v&i - an) - ^ Pji^jizj - ^n) 


i= 1 


j=i 


X] iydijf h (x) PijMzi - 7* n ) + Pyl'Azj - ijn) 


i+3 


6 2 


+ Y2^-{dnIh(£ l ) ~ dafhix )) ag* + (1 - 


2—1 


j = l 


+ yy(dufh(&) ~ duf h (x )) 


APi 


2—1 
d d 


Y^Pidijfkfej) - dijf h (x)) 
#3 


Pij l^iZi 


PijViZi PjiVjZj 


+ 5^5^7r($i/h(&j) - dafhix)) 
i= 1 i=l 

We remind the reader that our target is to prove that 


Gu(\)Afh{u) — Gyfh{x) 


J2dif h (x)\{ 


2=1 


d 

Vi-a -^2 PjiVj){5qi - pii 
3 = 1 


- Pi(e T x) 


+ E{u), 


where -B(tt) is an error term that satisfies 


|£(u)| < <SC(m,2)(l + |x| 2 r(l + |x|) 4 . 


We choose E[u) to be all the terms in (A.6) except for the first line. We now describe how 
to bound \E(u)\. Most of the summands in (A.6) look as follows: a term in large square 
brackets multiplied by some partial derivative of fh■ The partial derivatives are very easy 
to bound; we simply use (5.6) - (5.8). We wish to point out that ff, £7 and fij lie within 
distance 26 of x. When 26 < 1, (5.8) implies 

(A.7) | dijhiO ~ dijhix) | < 26C(1 + |x| 2 ) m (l + |x|) 3 

for some constant C > 0 (i.e. an extra 6 term is gained). When 26 > 1 (by Remark 1 this 
occurs in finitely many cases), we may use (5.7) to obtain (A.7) with a redefined C. From 
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here on out, we shall let C > 0 be a generic positive constant that will change from line to 
line, but will always be independent of A and n. 

Now we shall list the facts needed to bound all the square bracket terms in (A. 6 ) except 
for the very first one. Recall that we are operating in the Halfin-Whitt regime as defined 
by (1.2). Therefore, 

(■ nS 2 — 1) = 5/3 and <5(A — n) = —/3. 

Furthermore, it must be true that 

Sqi < ( e T x) + <C\x\, 

as the number of phase i customers may never exceed the total queue size. Next, 

1 5(zi - 7 in)| = |Xi - Sqt\ < C |x| 

and lastly, 

\S 2 Zi\ < |<5 2 7in| + 1 5 2 (zi - 7 *n)| < C( 1 + |x|). 

It is now a simple matter to verify that the inequalities above, combined with the bounds 
on the partials of fh are all that it takes to achieve our desired upper bound. 


A.3. Proof of Lemma 7 (Moment Bounds). We first provide an intuitive roadmap 
for the proof. The goal is to show that a Lyapunov function for the diffusion process is 
also a Lyapunov function for the CTMC; this has two parts to it. In the first part of this 
proof, we compare how the two generators G V (\) and Gy act on this Lyapunov function, 
obtaining an upper bound for the difference G V (\) — Gy in (A. 12 ). One notes that the 
right hand side of (A. 12) is unbounded. This is due to the difference in dimensions of the 
CTMC and diffusion process. To overcome this difficulty, we move on to the second part 
of the proof, which exploits our SSC result in Lemma 5 to bound the expectation of the 
right hand side of (A. 12). We end up with a recursive relationship that guarantees the 
2 mth moment is bounded (uniformly in A and n satisfying ( 1 . 2 )) provided that the mth 
moment is. Finally, we rely on prior results obtained in [13] for a uniform bound on the 
first moment. 

We remark that a version of this lemma was already proved [27, Theorem 3.3] for the case 
where the dimension of the CTMC equals the dimension of the diffusion process. However, 
the difference in dimensions poses an additional technical challenge, which is overcome in 
the second part of this proof. 

Its enough to prove (7.1) for the cases when m = 2 J for some j > 0. Furthermore, we 
may assume that A > 4 because by Remark 1, there are only finitely many cases when 


A < 4. In all those cases, E 


X'-N(oo) < oo by (5.12). Throughout the proof, we shall use 
G, Ci, C 2 , C 3 , C 4 to denote generic positive constants that may change from line to line. 
They may depend on (m, (3, a,p, is, P ), but will be independent of both A and n. Define 


v m ( X ) = a+v(x)) m 
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where V is as in (A.l). By [27, Remark 3.4], V m also satisfies 

G Y V m (x ) < -C 1 V m (x) + C 2 

as long as V 6 (7 3 (R d ) and satisfies condition (30) of [27], which is easy to verify. To prove 
the lemma, we will show that for large enough A, V satisfies 

E G uW AV m (uW(oo)) < -C 1 EV m (X^(oo)) + C 2 , 

where A is the lifting operator defined in (4.4). We begin by observing 


(A.8) G uW AV m < G vW AV m - G Y V m + G Y V m < G uW AV m - G Y V m - C\V m + C 2 . 
Using (A.6), we write G u (x)AV m — G Y V m as 


E diV m (x ) (i/i-a-X; / 7 '' / ,)( , % - Pi(e T x) + ) 


2=1 


3 = 1 


a a a 

T y ) dgVm(x) ^ " Pji^j^j 1) ^ ^ 9jj I'm (x) Pijl / i r li T PjiVj'Yj 1) 

*=i i =1 bU 


j=i 

d 5 2 


E —diiV m (x) Pi( A - n) - otqi - 17 (z; - 7*n) - E PjiVj{zj - Ijn) 


2=1 


j=i 


^ 0 

E —dijV m (x) PijVi{zi - 7i«) + PjiVj(zj - 7 jn) 




5 2 


+ E ~ a Qi + C 1 “ E 


2=1 


j=i 


+ E y( ,9 u'MC + ) - duV m (x)) 


\Pi 


2=1 


+ E E WO ~ dnV m (x)) 


E PidijVmiZii) - dijVmix)) 


Pij Vi Zi T Pj i V j Zj 


Pij V'j Zj 


i =1 J=1 


Now we wish to bound the derivatives of V m . By [27, Remark 3.4], V m satisfies (16) and 
(30) of [27], namely 


sup 
\y\< i 


Vm(x + y) 

V m (x) 


< c 


(A.9) 
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and 


(A.10) {\diV m {x)\ + \dijV m {x)\ + \d ijk V m {x)\)(l + |x|) < CV m (x). 

For £ being one of £“ or 

(A.11) - 9j J -V r m (a:)| (1 + |x|) < 5 \d i:ji V m (r]) + <%F m ( q)\ (1 + |x|) < C6V m (x), 


where the first inequality comes from a Taylor expansion and the second inequality follows 
by (A.10), the fact that \r] — x\ < 25 < 1 and by (A.9). Following the exact same argument 
that we used to bound (A.6) in the proof of Lemma 4 (with (A. 10) and (A. 11) replacing 
the gradient bounds of fh there), we get 




d 

GyV m < CSV m (x) + C^2 \diV m {x)\ 

i —1 


qi~Pi{e T x) + 


Differentiating V, we see that 

('VV(x)) 1 = 2(e r x)e T + 2k(x 7 — p T 4>(e T x))Q(I — pe T $ (e T x)). 


Combined with the fact that 0 < 4>'(x) < 1, it is clear that 


\diV{x)\ < C(l + \x\). 


Therefore, 

d 

(A.12) G uW AV m -G Y V m <C5V m (x)+C'Y^mV rn - i(x)(l + |x|) | qi~pi(e T x) 

i=l 

It remains to find an appropriate bound for 

V m -i(x)(l + |cc|) \qt -pi(e T x) + \ = 5V m -i(x)(l + \x\) 


\q% -pi(e T x)~ 
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We have 


5V m - i(x)(l + |x|) 


I qi ~Pi(e T x ) + 1 


< \/5y m _i(x)(l + \x\) 2 + VdVm-i^x) 


\Qi -Pi(e X ) 


T„\ + I 2 


< cV6V m (x ) + \/5U m _ 2 (x)U 2 (a;) + VSV m _ 2 (x) 


k — pi(e T x) + 


,T„\ + l 2n 2 


k - P;(e T x)+ 


21 4 


< Ca/W^x) + V5V m {x) + VdV m -4(x)V4(x) + VdV m -4(x) 

< ... 

(A.13K CVSV m (x) + V5 \ qi ~ Pi ( e x ) +l 


where in the last inequality, we used the fact that m = 2 3 . Using (A.8), (A.12) and (A.13), 


G uW AV m (u) < -F m (s)(Ci - VdC 3 ) + C 2 + V~5 C a ^2 


i= 1 


k ~Pi(e T x ) + 1 


21 m 


where x and q are related to u by (4.3). The arguments in the proof of Lemma 2 can be 
used to show 

E G uW AV m (U™( oo)) = 0. 

Therefore, for 5 small enough, 

2m 


1W( 


OO, 


< CEV m (X^(oo)) 
C 


< 


(Cl - vsc 3 ) 


d E <5Q.[^(oo) — pi(e T (oo)) + 

c 2 + Vsc 4 £---»- 


2 m 


i =1 


By (6.4), it follows that 


E 


xW(oo) 


2m 


< 


c 


Cl - V^Cg 


l + V6E[(e T X^\ oo)) 


Hence, we have a recursive relationship that guarantees 


supE 

A>0 


X(U(, 


oo, 


2m 


< OO 
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whenever 

supE[(e T X ( ' A ' ) (oo)) + ] m < oo. 

A>0 

To conclude, we need to verify that 

supE[(e T X < - A )(oo)) + ] < oo, 

A>0 

but this was proved in equation (5.2) of [13]. 
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